Purpose and Usage
For TextGrid 2.0, a special XML-based Service Description language has been developed. Established service description languages such as WSDL and WADL contain too many details in some respects, yet are not sufficient for the purposes of TextGrid. Some points of motivation:
- A WSDL or WADL document can describe more than one operation. In a workflow setting, however, the service must be launched with one specific operation
- WSDL and WADL make no semantic distinction between input data and configuration parameters. TextGrid, however, requires both batch processing of input data and fixed configuration parameters
- WSDL uses XSD-based types for parameters, whereas TextGrid uses a MIME-type-based model
- From a WSDL, a user only knows the type (e.g. "xsd:string") but not the possible values of a parameter (e.g. "UTF-8"). These need to be guessed or documented by separate means. However, having some of the possible values in a service description makes usage easier
- English names for input and output parameters and specific configuration values are desirable
- descriptive metadata, information about a license, etc., for a service are desirable
Additionally, REST-based services might not have any WSDL or WADL description at all. In this case, a service description that documents their usage is indispensable.
This is the usage pattern:
- A web service exists and either its developer or another TextGrid developer would like to make it available in TextGrid
- The developer writes a TextGrid service description for that service. Since there is not yet a GUI, an XML or plain text editor should be used. The necessary steps include:
- identify descriptive metadata about the service
- point to a WSDL service description (soap) or a service endpoint (rest)
- identify the service's operation to be used
- separate the inputs proper from configuration parameters
- give some details about each input, configuration, and output parameter
- provide possible example values for the configuration parameters, either inline or using a URI pointing to a TextGridObject
- The service description will be saved as a TextGridObject document in one of the developer's projects
- Verify the correctness of the service description by creating a workflow that contains it and running this workflow
- Once the service proves stable and usable, the description can be published and the TextGrid core team can be asked to list it among the approved services
During the creation of a workflow, the workflow wizard evaluates many settings of a service description. The TextGridURIs of the service descriptions are included in the TGWF workflow document. They are retrieved again when the workflow job is submitted, and they influence the structure and the data of the Petri Net (GWDL document) that the workflow engine accepts.
Creation
- Go to File - New - Element - XML Document.
- In the following dialogue, select a project and the object type "TextGrid Service Description (text/tg.servicedescription+xml)", click "Next", enter a title, and click "Finish".
- The TextGrid XML Editor will open with a service description skeleton
- Fill in the skeleton provided. You can use both the XML Editor's Design view and the Source view. However, the Source view lets you use the documentation annotations of the underlying Schema file, which pop up when the mouse pointer rests on an XML tag.
Technical Details
XSD: The XML Schema file includes documentation annotation and can be found here: https://develop.sub.uni-goettingen.de/repos/textgrid/trunk/lab/base/info.textgrid.lab.workflow/resources/ServiceDescription.xsd
Element Details: In addition to the documentation in the XSD file, more details about the contents of a Service Description are listed here.
- During workflow composition, only the data in service/descriptivedata/name, service/inputs, service/outputs and service/configparameters are evaluated
- During workflow submission, both these data and service/technicaldata are evaluated
- Since CRUD returns TextGrid documents Base64-encoded, all input and output values MUST be of type xsd:base64Binary
- service/technicaldata:
- type: currently one of "soap" or "rest"
- operation:
- for soap, the name of the desired operation out of the operations as listed in the WSDL.
- for rest, one of GET or POST (currently the only supported operations)
- descriptionlocation: the "uri" attribute must be filled with
- for soap, the address of the WSDL. Note: the WSDL must be self-sufficient, i.e. the workflow engine will address the service endpoint listed therein and there is no possibility to pass the engine a different endpoint. Both the WSDL and the service location within the WSDL must be accessible from the workflow engine.
- for rest, the invariable service endpoint. It usually has a form such as "http://server.example.org/service/operation?". It can contain placeholders marked with an "@" sign for parameters, such as "http://example.org/items/@id/show?"
- targetnamespace: this applies to soap services only and describes the target namespace of the XML schema associated with the service.
- Set the usetns attribute to true to tell the Workflow Engine that the message parameters should be prepended with the targetNamespace given by the uri attribute.
- Hint: set usetns to true if the schema definition part in the WSDL has elementFormDefault="qualified".
- If you interact with a Web Service written in a namespace-ignorant language (such as PHP, Python, Perl, or Tcl), usetns should probably be false.
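The "@" placeholders in a rest endpoint can be illustrated with a small Python sketch. It is not the workflow engine's code: the function name expand_endpoint is hypothetical, and it is an assumption that parameters without a matching placeholder are appended as a query string and that placeholder names look like identifiers.

```python
import re
from urllib.parse import quote, urlencode


def expand_endpoint(template: str, params: dict[str, str]) -> str:
    """Fill @name placeholders in a rest endpoint; append leftover params as a query."""
    used = set()

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value for placeholder @{name}")
        used.add(name)
        # Percent-encode the value so it is safe inside a URL path segment.
        return quote(params[name], safe="")

    # Assumption: a placeholder is "@" followed by an identifier-like name.
    url = re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)", substitute, template)

    remaining = {k: v for k, v in params.items() if k not in used}
    if remaining:
        if url.endswith(("?", "&")):
            separator = ""
        else:
            separator = "&" if "?" in url else "?"
        url += separator + urlencode(remaining)
    return url


print(expand_endpoint("http://example.org/items/@id/show?", {"id": "42", "format": "xml"}))
```

With the endpoint from the text, the value of "id" replaces "@id" in the path, while "format" (an invented parameter) ends up after the trailing "?".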
- service/inputs is for the proper input parameters for the service, i.e. the parameters that are variable depending upon the actual objects being processed
- @multiple
- "false" means that this input takes one data block/URI at a time and consumes them from a queue. In the GUI, this is called "one by one".
- "true" means that the service may take many arguments at once, as in a collate service. Those arguments are not consumed by the engine and remain available each time the service is called. In the GUI, this is called "pooled".
- There MUST be at least one input with multiple="false" such that the engine has a finite queue to process.
- @crud="true" means that the service knows how to read/readmetadata from TG-crud and that this input parameter accepts a URI
- @param is the name of the element / parameter as given in the service description document
- @name is a human-readable string which will be displayed so that users will know what this parameter is about
- @mimetypes are the mime type(s), separated by commas, of the documents accepted by this service for this parameter
- @optional: set to true if this parameter can be omitted
- for rest, one special input can have the name "POSTBODY". This will then be handed to the server unchanged in the POST call
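The difference between "one by one" and "pooled" inputs can be sketched in Python as follows. This only models the behaviour described above and is not the engine's implementation: plan_calls, the parameter names and the URIs are hypothetical, and it is an assumption that several one-by-one inputs advance in lockstep.

```python
def plan_calls(queued: dict[str, list[str]], pooled: dict[str, list[str]]) -> list[dict]:
    """List the argument sets for each service call.

    queued: inputs with multiple="false"; one value is consumed per call.
    pooled: inputs with multiple="true"; all values are passed on every call.
    """
    if not queued:
        # Without a one-by-one input there is no finite queue to process.
        raise ValueError('at least one input must have multiple="false"')

    calls = []
    # Assumption: several queued inputs are consumed in lockstep.
    for values in zip(*queued.values()):
        call = dict(zip(queued.keys(), values))
        # Pooled values are not consumed, so each call gets the full list.
        call.update({name: list(uris) for name, uris in pooled.items()})
        calls.append(call)
    return calls


for call in plan_calls(
    queued={"base": ["textgrid:a", "textgrid:b"]},
    pooled={"witnesses": ["textgrid:x", "textgrid:y"]},
):
    print(call)
```

Here the service would be called twice, once per queued "base" document, and both calls would receive the complete "witnesses" list.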
- service/outputs are the output parameters from the Service.
- Every output parameter must produce some result, i.e. it either yields a new TextGridObject document if the service is the last one in the workflow chain, or it must be connected to an input parameter of some service further down the chain.
- @crud="true" means that the service knows how to create a new TGO via Crud and that this output parameter returns a URI
- for the meaning of the rest, see inputs
- service/configparameters are those input parameters that are constant throughout the workflow
- Each parameter will have some example values which can be used directly or serve as a starting point for modification by the user
- @needsB64encoding="true" tells the editor to encode the text given by the user in Base64 before handing it over to the service
- for the meaning of the rest, see inputs
- there MUST be at least one <examplevalue> sub-element for each <configparameter>
- the list of example values is meant as a hint to the user and not as an exclusive list, i.e. it can be extended by users in their workflows
- Attributes of example values:
- @id will be used by the workflow editor to remember this value in workflow documents
- @name is a human readable name for display, e.g. "The FooMatic value"
- @default="true" if this is the default value for this configuration parameter. Please specify EXACTLY ONE default value per parameter (this will not be checked by the schema)
- @inline
- true if the element's content is to be used directly as value
- false to specify a TextGridObject URI in the element's content
- Inline parameter values will be written as-is, i.e. as strings. If they are XML, please declare a namespace on them so that they do not inherit the servicedescription namespace.
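Because the schema does not check the exactly-one-default rule, a small script can catch mistakes before a workflow is run. The following Python sketch checks that rule together with the at-least-one-examplevalue rule, and shows what needsB64encoding="true" amounts to for an inline value. The element and attribute names follow the text above. The sample values and the helper names are hypothetical, the real namespace declaration is omitted, and tags are matched by local name only.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical fragment: names follow the text above, values are invented,
# and the servicedescription namespace is left out.
SAMPLE = """\
<configparameters>
  <configparameter param="encoding" name="Character encoding" needsB64encoding="true">
    <examplevalue id="utf8" name="UTF-8" default="true" inline="true">UTF-8</examplevalue>
    <examplevalue id="latin1" name="ISO-8859-1" inline="true">ISO-8859-1</examplevalue>
  </configparameter>
</configparameters>
"""


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix so the check works with or without one."""
    return tag.rsplit("}", 1)[-1]


def children(element: ET.Element, name: str) -> list[ET.Element]:
    """Return the direct children of element with the given local name."""
    return [child for child in element if local_name(child.tag) == name]


def check_config_parameters(root: ET.Element) -> list[str]:
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    for param in root.iter():
        if local_name(param.tag) != "configparameter":
            continue
        label = param.get("param", "(unnamed)")
        examples = children(param, "examplevalue")
        if not examples:
            problems.append(f"{label}: no <examplevalue> given")
            continue
        defaults = [e for e in examples if e.get("default") == "true"]
        if len(defaults) != 1:
            problems.append(f"{label}: {len(defaults)} default values, expected exactly 1")
    return problems


def default_value(param: ET.Element) -> str:
    """Return the inline default, Base64-encoded if needsB64encoding is true."""
    default = next(e for e in children(param, "examplevalue") if e.get("default") == "true")
    if default.get("inline") != "true":
        raise ValueError("default is a TextGridObject URI, not an inline value")
    text = default.text or ""
    if param.get("needsB64encoding") == "true":
        return base64.b64encode(text.encode("utf-8")).decode("ascii")
    return text


root = ET.fromstring(SAMPLE)
print(check_config_parameters(root))
print(default_value(children(root, "configparameter")[0]))
```

A default that points to a TextGridObject (inline="false") is rejected here rather than resolved, since fetching the object is outside the scope of this sketch.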